Georgia Tech - Jigsaw
 
VAST 2011 Challenge
Mini-Challenge 3 - Investigation into Terrorist Activity



Authors and Affiliations:
Elizabeth Braunstein, Mercyhurst College, ebraun65@lakers.mercyhurst.edu
Carsten Görg, Univ. of Colorado, Denver, carsten.goerg@ucdenver.edu
Zhicheng Liu, Georgia Tech, zliu6@gatech.edu
John Stasko, Georgia Tech, stasko@cc.gatech.edu [PRIMARY Contact]

Tool(s):
We used the Jigsaw visual analytics system developed here in our group at Georgia Tech to work on the problem. Jigsaw is an analysis system to help people working with document collections. It's been in development over the last five years. More about the system can be found at http://www.gvu.gatech.edu/ii/jigsaw.


Video: Video


ANSWERS:
------------------------------------------------------------------------

MC 3.1 Potential Threats: Identify any imminent terrorist threats in the Vastopolis metropolitan area. Provide detailed information on the threat or threats (e.g. who, what, where, when, and how) so that officials can conduct counterintelligence activities. Also, provide a list of the evidential documents supporting your answer.


Process Description
Our analysis process included three key components: data import, data cleaning and annotation, and the actual analysis. These three phases were not strictly separated but intertwined throughout the entire process. For instance, we continued data cleaning while doing analysis.

Data Import
First we had to import the initial files into Jigsaw. We added some code to Jigsaw so that the system would parse each document's text and separate it into sections for the title, date, and article body. This allowed us to read all the documents into the system. Next, we ran the entity identification process within Jigsaw to find relevant people, organizations, locations, etc. within the documents.
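The splitting step can be illustrated with a minimal sketch. This is not the code we added to Jigsaw (which is written in Java); it is a Python illustration that assumes a hypothetical file layout in which the first non-empty line is the title, the second is the date, and everything after that is the article body. The names Article and parse_article are made up for this example, and the date is kept as a raw string because the sketch makes no assumption about its format.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    date: str  # kept as raw text; no date format is assumed
    body: str


def parse_article(raw: str) -> Article:
    """Split one document into title, date, and body.

    Assumed layout (hypothetical): first non-empty line is the title,
    second non-empty line is the date, the remaining lines are the body.
    """
    lines = [line.strip() for line in raw.strip().splitlines()]
    lines = [line for line in lines if line]  # drop blank lines
    if len(lines) < 3:
        raise ValueError("expected a title line, a date line, and body text")
    title, date, *body = lines
    return Article(title=title, date=date, body=" ".join(body))


# Placeholder text, not taken from the challenge data.
sample = """Placeholder headline

1 January 2000

First sentence of the body.
Second sentence of the body.
"""
article = parse_article(sample)
print(article.title)
print(article.date)
print(article.body)
```

Keeping the three sections separate is what lets later steps treat the date as metadata and run entity identification on the body alone.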

Data Cleaning and Annotation
The entity extraction process produced too many entities (e.g. 21,866 people and 19,184 organizations) to be manageable, including many false positives. To reduce the number of entities we added code to Jigsaw to remove entities occurring in only one or two documents within the system. In general, we thought that such entities were likely either errors or not important to a central plot. If we subsequently found one of these entities to be important, we could add it back in later. This process decreased the number of person entities by almost a factor of 10, resulting in a much more manageable set. We then did further manual cleaning of the data set by removing or correcting wrongly identified entities and adding entities that were missed in the identification process. This process took about a week, working for a few hours per day. Finally, we ran Jigsaw's computational text analyses of the documents to compute summary sentences for each document, similarities across documents, and clusters of related documents.
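The document-frequency filter described above can be sketched as follows. Again, this is an illustrative Python sketch rather than the Jigsaw code itself; the function name filter_rare_entities, the min_docs parameter, and the sample data are all hypothetical. An entity is kept only if it appears in at least three documents, which corresponds to removing entities that occur in only one or two.

```python
from collections import defaultdict


def filter_rare_entities(doc_entities, min_docs=3):
    """Return {entity: set of document ids} for entities that appear
    in at least `min_docs` documents; rarer entities are dropped."""
    docs_per_entity = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            docs_per_entity[entity].add(doc_id)
    return {
        entity: doc_ids
        for entity, doc_ids in docs_per_entity.items()
        if len(doc_ids) >= min_docs
    }


# Placeholder data: document id -> entities identified in that document.
doc_entities = {
    "doc_a": {"Alice Example", "Bob Sample"},
    "doc_b": {"Alice Example"},
    "doc_c": {"Alice Example", "Bob Sample"},
    "doc_d": {"Carol Placeholder"},
}
kept = filter_rare_entities(doc_entities)
print(sorted(kept))
```

Because the filter returns the surviving entities without modifying the input, an entity that later turns out to matter can still be looked up in the original data and added back.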

Investigative Analysis
We then began investigating the documents in more depth. We primarily used the List View (Figure 1) within Jigsaw to explore the different entities in the collection. Lists of entities (by type) can be sorted alphabetically or by the number of documents in which they appear. Selecting entities in the view highlights connected (related) entities. At the same time, we explored sets of documents that were put together into groups within the Cluster View in Jigsaw (Figure 2). We would identify potentially interesting entities and documents via these views and would load the relevant documents into Jigsaw's Document View (Figure 3) for more detailed analysis. This initial investigation did not turn up many good leads. The most common entities from the List View did not appear to be involved in any suspicious activities. Similarly, the clustering did not produce helpful sets of documents.
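The two List View operations mentioned above, sorting by document count and highlighting connected entities, can be approximated with a small sketch. It assumes that two entities count as connected when they appear together in at least one document; the function names and sample data are hypothetical, and the sketch is not Jigsaw's implementation.

```python
from collections import defaultdict


def build_index(doc_entities):
    """Map each entity to the set of documents in which it appears."""
    index = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            index[entity].add(doc_id)
    return dict(index)


def sort_by_frequency(index):
    """Order entities by document count (descending), then alphabetically."""
    return sorted(index, key=lambda entity: (-len(index[entity]), entity))


def connected_entities(selected, doc_entities, index):
    """Entities sharing at least one document with `selected`
    (assumed definition of a connection)."""
    related = set()
    for doc_id in index.get(selected, ()):
        related |= doc_entities[doc_id]
    related.discard(selected)
    return related


# Placeholder data: document id -> entities identified in that document.
docs = {
    "d1": {"City X", "Group Y"},
    "d2": {"City X", "Person Z"},
    "d3": {"Person W"},
}
index = build_index(docs)
print(sort_by_frequency(index))
print(sorted(connected_entities("City X", docs, index)))
```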


Figure 1: List View showing connections from Vastopolis to other entities.


Figure 2: Document Cluster View showing sets of related documents.


Figure 3: Document View containing a number of the documents crucial to the plot, with 03212 in focus.

By reading many documents in this way, however, we did begin to notice trends and threads of potential terror/criminal plots. We also searched on relevant terms and examined the sets of resulting documents that Jigsaw loaded for them. This began to give us ideas about what might be going on. It seemed that the majority of documents in the collection were slightly modified news articles from the late 1990s; they did not seem related to the plot. A smaller number of typically shorter documents, involving recent activities in Vastopolis, were suspicious, however.

At this point, we decided to examine all of the documents to find ones fitting this pattern. We used Jigsaw's Document View to load all the documents and do a rapid triage-style pass through them, looking for documents meeting this suspicious profile. The Document View allows a person to do this very quickly and we were able to go through the entire collection in about 3-4 hours. This process gave us approximately 60 "suspicious" documents to examine in more detail. We loaded all of these documents into Jigsaw's Calendar View (Figure 4) to see their timing and we read their contents more closely. We used the Tablet window (Figure 5) of Jigsaw to take notes, create timelines, and gather our thoughts about the plot.


Figure 4: Calendar View showing the dates of the small set of suspicious documents.


Figure 5: Tablet window showing our analysis notes including a timeline and the relevant terrorist groups and individuals.




Solution
It appears that terrorists are planning a bioterrorist attack on Vastopolis. The two terrorist organizations to watch are the Paramurderers of Chaos and the Forever Brotherhood of Antarctica. We believe that the terrorists likely will attack the food and water supply either through putting something into the water or contaminating food in food processing plants. We believe that terrorists will use a biological agent such as a spore. The materials for doing this may have been stolen from the lab of Prof. Edward Patino at VAST University. The recent animal deaths both in fields around Vastopolis and in the river raise suspicions about contaminants/poisons/diseases being used. The Citizens for Ethical Treatment of Lab Mice organization has threatened city officials before, and they have been linked to the Forever Brotherhood.

We started to suspect the Forever Brotherhood of Antarctica and the Paramurderers of Chaos because they are linked to bioterror threats. We think that the deaths of the fish and the other animals are related to actions by the Paramurderers of Chaos because the police confiscated their lab equipment and because suspicious people had been seen in the areas where the animals started dying. Also, the mayor was the victim of a dognapping carried out by the terrorist organization Forever Brotherhood of Antarctica.

Relevant articles: 00008, 00878, 01038, 01482, 01785, 01878, 02385, 03040, 03212, 03237, 03295, 03435, 03662, 03740, 04085, 04314